Abstract

We present an efficient general method for realizing a quantum walk operator corresponding
to an arbitrary sparse classical random walk. Our approach is based on Grover and Rudolph's
method for preparing coherent versions of efficiently integrable probability distributions [2].
This method is intended for use in quantum walk algorithms with polynomial speedups, whose
complexity is usually measured in terms of how many times we have to apply a step of a
quantum walk [1], compared to the number of necessary classical Markov chain steps. We consider
a finer notion of complexity that includes the number of elementary gates it takes to implement
each step of the quantum walk with some desired accuracy. Our method scales linearly in the
sparsity parameter and poly-logarithmically in the inverse of the desired precision, whereas
the best previously known general methods scale either quadratically in the sparsity parameter
or polynomially in the inverse precision. Our approach is especially relevant for implementing
quantum walks corresponding to classical random walks like those used in the classical
algorithms for approximating permanents [7] and sampling from binary contingency tables [8].
In those algorithms, the sparsity parameter grows with the problem size, while high precision
must be maintained.

1 Introduction 

For many tasks, such as simulated annealing [3, 4], computing the volume of convex bodies [5] and
approximating the permanent of a matrix [6], [7] (see references in [9] for more), the best approaches
known today are randomized algorithms based on Markov chains (random walks) and sampling. A
Markov chain on a state space $\mathcal{E}$ is described by a stochastic matrix
$P = (p_{xy})_{x,y\in\mathcal{E}}$. Its entry $p_{xy}$ is equal to the probability of making a
transition from state $x$ to state $y$ in the next step. If the Markov chain $P$ is ergodic
(see e.g. [10]), then there is a unique probability distribution $\pi = (\pi_x)_{x\in\mathcal{E}}$
such that $\pi P = \pi$. This probability distribution is referred to as the stationary distribution.
Moreover, repeated application of $P$ takes any initial probability distribution to $\pi$ in the
limit of infinitely many steps. For simplicity, we assume that the Markov chain is reversible,
meaning that the condition $\pi_x p_{xy} = \pi_y p_{yx}$ is fulfilled for all distinct $x$ and $y$.
The largest eigenvalue of the matrix $P$ is $\lambda_0 = 1$. The corresponding eigenvector is equal
to the stationary distribution $\pi$. How fast a given Markov chain approaches $\pi$ is governed by
the second eigenvalue $\lambda_1$ of $P$ (which is strictly less than 1), or viewed alternatively,
by the eigenvalue gap $\delta = 1 - \lambda_1$ of the matrix $P$. This determines the performance of
random walk based algorithms whose goal is to sample from the stationary distribution $\pi$.
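
The definitions above can be checked numerically on a toy example. The following is a minimal sketch, assuming a hypothetical 3-state chain (a lazy walk on a path) that is not taken from the paper; the names `step` and `stationary` are illustrative only. It approximates $\pi$ by repeated application of $P$ and then tests $\pi P = \pi$ and the reversibility condition.

```python
# Hypothetical 3-state reversible chain: a lazy random walk on the path 0 - 1 - 2.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

def step(dist, P):
    """One classical step: the row vector dist is multiplied by P from the right."""
    n = len(P)
    return [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]

def stationary(P, iterations=200):
    """Approximate pi by repeatedly applying P to an arbitrary starting distribution."""
    dist = [1.0] + [0.0] * (len(P) - 1)
    for _ in range(iterations):
        dist = step(dist, P)
    return dist

pi = stationary(P)

# Stationarity: pi P = pi.
residual = max(abs(a - b) for a, b in zip(step(pi, P), pi))

# Reversibility (detailed balance): pi_x p_xy = pi_y p_yx.
balance_violation = max(
    abs(pi[x] * P[x][y] - pi[y] * P[y][x])
    for x in range(3) for y in range(3)
)
```

For this particular chain the eigenvalues of $P$ are 1, 1/2 and 0, so the eigenvalue gap is $\delta = 1/2$ and the iteration converges quickly.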

In [1], Szegedy defined a quantum walk as a quantum analogue of a classical Markov chain.
Each step of the quantum walk needs to be unitary, and it is convenient to define it on a quantum
system with two registers, $\mathcal{H} = \mathcal{H}_L \otimes \mathcal{H}_R$. The quantum update
rule, defined in [14], is any unitary that acts as

$$U\,|x\rangle_L |0\rangle_R = |x\rangle_L \sum_{y\in\mathcal{E}} \sqrt{p_{xy}}\,|y\rangle_R \qquad (1)$$

on inputs of the form $|x\rangle_L|0\rangle_R$ for all $x \in \mathcal{E}$. (Its action on inputs
$|x\rangle_L|y \neq 0\rangle_R$ can be chosen arbitrarily.) Using such $U$, we define two subspaces
of $\mathcal{H}$. First,

$$A = \mathrm{span}\{U\,|x\rangle_L|0\rangle_R\} \qquad (2)$$

is the span of all vectors we get from acting with $U$ on $|x\rangle_L|0\rangle_R$ for all
$x \in \mathcal{E}$, and second, the subspace $B = SA$ is the subspace we get by swapping the two
registers of $A$. Using the quantum update, we can implement a reflection about the subspace $A$ as

$$\mathrm{Ref}_A = U\left(2\,|0\rangle\langle 0|_R - I\right)U^\dagger. \qquad (3)$$

Szegedy defined a step of the quantum walk as

$$W = \mathrm{Ref}_B \cdot \mathrm{Ref}_A, \qquad (4)$$

a composition of the two reflections about $A$ and $B$. This operation is unitary, and the state

$$|\psi_\pi\rangle = \sum_{x}\sum_{y} \sqrt{\pi_x\,p_{xy}}\;|x\rangle_L|y\rangle_R, \qquad (5)$$

where $\pi$ is the stationary distribution of $P$, is an eigenvector of $W$ with eigenvalue 1.
Szegedy [1] proved^1 that when we parametrize the eigenvalues of $W$ as $e^{i\pi\theta_i}$, the
second smallest phase $\theta_1$ (after $\theta_0 = 0$) is related to the second largest eigenvalue
$\lambda_1$ of $P$ as $|\theta_1| > \sqrt{1 - \lambda_1}$. This can be viewed as a square-root
relationship $\Delta > \sqrt{\delta}$ between the phase gap $\Delta = |\theta_1 - \theta_0|$ of the
unitary operator $W$ and the spectral gap $\delta = |\lambda_0 - \lambda_1|$ of $P$. This
relationship is at the heart of the quantum speedups of quantum walk based algorithms over their
classical counterparts.
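
To make the construction concrete, here is a small state-vector sketch, assuming the same hypothetical 3-state chain as a toy input (it is not an example from the paper). The helper names `reflect_A`, `swap` and `walk_step` are illustrative. The sketch applies $W = \mathrm{Ref}_B \cdot \mathrm{Ref}_A$ directly to amplitudes and checks that $|\psi_\pi\rangle$ is left unchanged.

```python
import math

# Hypothetical reversible chain and its stationary distribution.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
pi = [0.25, 0.50, 0.25]
n = len(P)

def idx(x, y):
    """Index of the basis state |x>_L |y>_R in a flat vector of length n*n."""
    return x * n + y

def reflect_A(psi):
    """Apply Ref_A = 2*Pi_A - I, where A is spanned by |x> sum_y sqrt(p_xy) |y>."""
    out = [-a for a in psi]
    for x in range(n):
        overlap = sum(math.sqrt(P[x][y]) * psi[idx(x, y)] for y in range(n))
        for y in range(n):
            out[idx(x, y)] += 2.0 * overlap * math.sqrt(P[x][y])
    return out

def swap(psi):
    """Swap the two registers: |x>|y> -> |y>|x>."""
    out = [0.0] * (n * n)
    for x in range(n):
        for y in range(n):
            out[idx(y, x)] = psi[idx(x, y)]
    return out

def reflect_B(psi):
    """Ref_B = S Ref_A S, since B is A with the two registers swapped."""
    return swap(reflect_A(swap(psi)))

def walk_step(psi):
    """One step W = Ref_B Ref_A of the quantum walk."""
    return reflect_B(reflect_A(psi))

# The state of eq. (5), with amplitudes sqrt(pi_x p_xy).
psi_pi = [math.sqrt(pi[x] * P[x][y]) for x in range(n) for y in range(n)]
deviation = max(abs(a - b) for a, b in zip(walk_step(psi_pi), psi_pi))
```

The invariance relies on reversibility: detailed balance makes $|\psi_\pi\rangle$ symmetric under the register swap, so it lies in both $A$ and $B$.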

Many of the recent quantum walk algorithms for searching [12], [13], [14], [15], evaluating formulas
and span programs [16], [17], [25], quantum simulated annealing [18], quantum sampling [19], [20] and
approximating partition functions based on classical Markov chains [9] can be viewed in Szegedy's
generalized quantum walk model. For all these algorithms, an essential step in implementing the
quantum walk $W$ is the ability to implement the quantum update rule (1). For the basic search-like
and combinatorial algorithms with low-degree underlying graphs, an efficient implementation of
the corresponding quantum walks is straightforward. However, for complicated transition schemes
coming from Markov chains like those for simulated annealing or for approximating partition
functions of the Potts model, the situation is not so clear-cut. The standard polynomial speed-ups of
these quantum algorithms are viewed in terms of how many times we have to apply the quantum
walk operator versus the number of times we have to apply one step of the classical random walk
(Markov chain). However, a finer notion of complexity including the number of elementary gates it
takes to implement each step of the quantum walk is needed here. Our work addresses the question
whether it is possible to apply the steps of these quantum walk-based algorithms efficiently enough
so as not to destroy the polynomial speedups.

^1 Nagaj et al. give a simpler way to prove this relationship using Jordan's lemma in [11].

In Section 2 we review the recent alternative approaches to the implementation of $U$, such as
those relying on efficient simulation of sparse Hamiltonians [21]. We find that they either scale
quadratically in the sparsity parameter $d$, or polynomially in $\frac{1}{\epsilon}$, where
$\epsilon$ is the allowed error in the implementation of $U$. When there is only a small number of
neighbors connected to each state $x$, or we do not need to use many steps of the quantum walk so
that we can tolerate more implementation error, one could use these methods. However, subtle
algorithms like [9] require many precise uses of $U$ which couple many (a number growing with the
system size) neighboring states. In Appendix A we show a particular example (a first step towards a
possible future quantum version of the classical algorithm for approximating the permanent [7]),
where the alternative approaches to $U$ destroy the polynomial speedup of the quantum algorithm.
This is why we developed our new method, scaling linearly in the sparsity parameter $d$ and
polynomially in $\log\frac{1}{\epsilon}$.

Our general approach to the implementation of quantum walks based on sparse classical Markov
chains is based on Grover and Rudolph's method of preparing states corresponding to efficiently
integrable probability distributions [2]. In our case, the quantum samples we need to prepare
correspond to probability distributions that are supported on at most $d$ states of $\mathcal{E}$,
which implies that they are efficiently integrable. Thus, we can use the method [2] to obtain an
efficient circuit for the quantum update. The basic trick underlying Grover and Rudolph's method,
preparing superpositions by subsequent rotations, was first proposed by Zalka [22]. Note that
Childs [23], investigating the relationship between continuous-time [23] and discrete-time [26]
quantum walks, also proposed using [2], including for some quantum walks with non-sparse
underlying graphs.

This is our main result about the quantum update rule U, the essential ingredient in the 
implementation of the quantum walk defined as the quantum analogue of the original Markov 
chain: 

Theorem 1 (An Efficient Quantum Update Rule). Consider a reversible Markov chain on the
state space $\mathcal{E}$, with $|\mathcal{E}| = 2^m$, with a transition matrix
$P = (p_{xy})_{x,y\in\mathcal{E}}$. Assume that

1. there are at most $d$ possible transitions from each state ($P$ is sparse),

2. the transition probabilities $p_{xy}$ are given with $t$-bit precision, with
$t = \Omega\left(\log\frac{1}{\epsilon} + \log d\right)$,

3. we have access to a reversible circuit returning the list of (at most $d$) neighbors of the state $x$
(according to $P$), which can be turned into an efficient quantum circuit $N$ acting as

$$N\,|x\rangle|0\rangle\cdots|0\rangle = |x\rangle\,|y_0^x\rangle\cdots|y_{d-1}^x\rangle, \qquad (6)$$

4. we have access to a reversible circuit which can be turned into an efficient quantum circuit $T$
acting as

$$T\,|x\rangle|0\rangle\cdots|0\rangle = |x\rangle\,|p_{xy_0^x}\rangle\cdots|p_{xy_{d-1}^x}\rangle. \qquad (7)$$

Then there exists an efficient quantum circuit $\tilde{U}$ simulating the quantum update rule

$$U\,|x\rangle|0\rangle = |x\rangle\sum_{y}\sqrt{p_{xy}}\,|y\rangle, \qquad (8)$$

where the sum over $y$ is over the neighbors of $x$, and $p_{xy}$ are the elements of $P$, with precision

$$\left\|\left(U - \tilde{U}\right)|x\rangle\otimes|0\rangle\right\| \le \epsilon \qquad (9)$$

for all $x \in \mathcal{E}$, with required resources scaling linearly in $m$, polynomially in
$\log\frac{1}{\epsilon}$ and linearly in $d$ (with an additional poly($\log d$) factor).

The paper is organized as follows. In Section 2 we describe the alternative approaches one
could take to implement the quantum update and discuss their efficiency. In Section 3 we present
our algorithm based on Grover and Rudolph's state preparation method. We conclude our discussion
in Section 4. In Appendix A we give an example where our approach is better than the alternative
methods, and finally, we present the remaining details for the quantum update circuit, its required
resources, and its implementation in Appendix B.



2 Alternative Ways of Implementing the Quantum Update 

Before we give our efficient method, we review the alternative approaches in more detail. We know
of three other ways one could implement the quantum update. The first two
are based on techniques for simulating Hamiltonian time evolutions, while the third uses a novel
technique for implementing combinatorially block-diagonal unitaries.

The first method is to directly realize the reflection $\mathrm{Ref}_A$ as $\exp(-i\Pi_A\tau)$ for
time $\tau = \pi$ (which equals $\mathrm{Ref}_A$ up to a global phase), where the projector $\Pi_A$
onto the subspace $A$ turns out to be a sparse Hamiltonian. Observe that the projector

$$\Pi_A = \sum_{x\in\mathcal{E}} |x\rangle\langle x| \otimes \sum_{y,y'\in\mathcal{E}} \sqrt{p_{xy}\,p_{xy'}}\;|y\rangle\langle y'|$$

is a sparse Hamiltonian provided that $P$ is sparse. Thus, we can approximately implement the
reflection $\mathrm{Ref}_A$ by simulating the time evolution according to $H = \Pi_A$ for the time
$\tau = \pi$. The same methods apply to the reflection $\mathrm{Ref}_B$, so we can approximately
implement the quantum walk $W(P)$, which is a product of these two reflections. The requirements of
this method scale polynomially in $\frac{1}{\epsilon}$, where $\epsilon$ is the desired accuracy of
the unitary quantum update. Moreover, the number of gates used in each $U$ scales at least linearly
with $d$ and $m$.
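
As a quick consistency check on the evolution time, the exponential of a projector can be summed in closed form. The short derivation below is added here as a sanity check and uses only $\Pi_A^2 = \Pi_A$ and $\mathrm{Ref}_A = 2\Pi_A - I$:

```latex
e^{-i\tau\Pi_A}
  = I + \sum_{k\ge 1}\frac{(-i\tau)^k}{k!}\,\Pi_A
  = I + \left(e^{-i\tau}-1\right)\Pi_A ,
\qquad
e^{-i\pi\Pi_A} = I - 2\Pi_A = -\,\mathrm{Ref}_A .
```

Evolving under $\Pi_A$ for time $\pi$ therefore reproduces the reflection up to the global phase $-1$.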

The second approach is to apply novel general techniques for implementing arbitrary
row-and-column-sparse unitaries, due to Childs [27] and Jordan and Wocjan [28]. Similarly to the first
method, it relies on simulating a sparse Hamiltonian for a particular time. However, the complexity
of this method again scales polynomially in $\frac{1}{\epsilon}$ (and linearly in $d$ and $m$).

The third alternative is to utilize techniques for implementing combinatorially block-diagonal
unitary matrices. A (unitary) matrix $M$ is called combinatorially block-diagonal if there exists a
permutation matrix $P$ (i.e., a unitary matrix with entries 0 and 1) such that

$$P M P^{-1} = \bigoplus_{b=1}^{B} M_b$$

and the sizes of the blocks $M_b$ are bounded from above by some small $d$. The method works as
follows: each $x \in \mathcal{E}$ can be represented by the pair $\{b(x), p(x)\}$, where $b(x)$
denotes the block number of $x$ and $p(x)$ denotes the position of $x$ inside the block $b(x)$.
The unitary $M$ can then be realized by

1. the basis change $|x\rangle \mapsto |b(x)\rangle \otimes |p(x)\rangle$,

2. the controlled operation $\sum_{b=1}^{B} |b\rangle\langle b| \otimes M_b$, and

3. the basis change $|b(x)\rangle \otimes |p(x)\rangle \mapsto |x\rangle$.

The transformations $M_b$ can be implemented using $O(d^2)$ elementary gates based on the
decomposition of unitaries into a product of two-level matrices [29]. The special case $d = 2$ is
worked out in the paper by Aharonov and Ta-Shma [30]. The reflection
$\mathrm{Ref}_A = 2\Pi_A - I$ then has the form

$$\mathrm{Ref}_A = \sum_{x\in\mathcal{E}} |x\rangle\langle x| \otimes \sum_{y,y'\in\mathcal{E}} \left(2\sqrt{p_{xy}\,p_{xy'}} - \delta_{y,y'}\right)|y\rangle\langle y'|,$$

where $\delta_{y,y'} = 1$ for $y = y'$ and 0 otherwise. Viewed in this form, we see that
$\mathrm{Ref}_A$ is a combinatorially block-diagonal unitary matrix, with a block decomposition
with respect to the 'macro' coordinate $x$. Inside each 'macro' block labeled by $x$, we obtain a
'micro' block of size $d$ corresponding to all $y$ with $p_{xy} > 0$ and many 'micro' blocks of
size 1 corresponding to all $y$ with $p_{xy} = 0$ after a simple permutation of the rows and
columns. The disadvantage of this way of implementing quantum walks is that its complexity scales
quadratically with $d$, the maximum number of neighbors for each state $x$ (and linearly in $m$
and $\log\frac{1}{\epsilon}$).
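
The 'micro' block for a fixed $x$ can be written out explicitly. The sketch below assumes a hypothetical row of $P$ with $d = 4$ nonzero entries (illustrative values, not from the paper); `reflection_block` and `matmul` are helper names introduced here. It builds the $d \times d$ block $2|v\rangle\langle v| - I$ with $v_y = \sqrt{p_{xy}}$ and checks that it squares to the identity, as a reflection must.

```python
import math

def reflection_block(probs):
    """Dense d x d 'micro' block 2|v><v| - I with v_y = sqrt(p_xy), for one fixed x."""
    v = [math.sqrt(p) for p in probs]
    d = len(v)
    return [[2.0 * v[i] * v[j] - (1.0 if i == j else 0.0) for j in range(d)]
            for i in range(d)]

def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    d = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

block = reflection_block([0.1, 0.2, 0.3, 0.4])   # hypothetical row of P with d = 4
square = matmul(block, block)

# A reflection is its own inverse, so block * block should be the identity.
max_dev = max(abs(square[i][j] - (1.0 if i == j else 0.0))
              for i in range(4) for j in range(4))
```

The block is dense, with $d^2$ generally nonzero entries, which is consistent with the quadratic dependence on $d$ quoted for this method.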

In the next Section, we show how to implement the quantum update rule by a circuit with the
number of operations scaling linearly with the sparsity parameter $d$ (with additional
poly($\log d$) factors), linearly in $m = \log|\mathcal{E}|$ and polynomially in
$\log\frac{1}{\epsilon}$.




3 Overview of the Quantum Algorithm 

Our efficient circuit for the Quantum Update Rule

$$U\,|x\rangle_L|0\rangle_R = |x\rangle_L\sum_{i=0}^{d-1}\sqrt{p_{xy_i^x}}\;|y_i^x\rangle_R \qquad (10)$$

works in the following way:

1. Looking at $x$ in the 'left' register, put a list of its (at most $d$) neighbors $y_i^x$ into an
extra register and the corresponding transition probabilities $p_{xy_i^x}$ into another extra register.

2. Using the list of probabilities, prepare the superposition

$$\sum_{i=0}^{d-1}\sqrt{p_{xy_i^x}}\;|i\rangle_S \qquad (11)$$

in an extra 'superposition' register $S$.

3. Using the list of neighbors, put $\sum_{i=0}^{d-1}\sqrt{p_{xy_i^x}}\;|y_i^x\rangle_R|i\rangle_S$
in the registers $R$ and $S$.

4. Clean up the $S$ register using the list of neighbors of $x$ and uncompute the transition
probability list and the neighbor list.

Figure 1: The scheme for preparing the superposition
$\sum_{i=0}^{d-1}\sqrt{q_i^{(\log d)}}\;|i\rangle$ in $\log d$ rounds.

We already assumed we can implement Step 1 of this algorithm efficiently. The second, crucial
step is described in Section 3.1. Additional details for steps 3 and 4 are spelled out in Appendix
B. Finally, the cleanup step 4 is possible because of the unitarity of step 1.

3.1 Preparing Superpositions a la Grover and Rudolph 

The main difficulty is the efficient preparation of (11). We start with a list of transition
probabilities $\{p_{xy_i^x},\ 0 \le i \le d-1\}$ with the normalization property
$\sum_{i=0}^{d-1} p_{xy_i^x} = 1$. Our approach is an application of the powerful general procedure
of [2]. The idea is to build the superposition up in $\log d$ rounds of doubling the number of terms
in the superposition (see Figure 1). Each round involves one of the qubits in the register $S$, to
which we apply a rotation depending on the state of the qubits which we have already touched.

For simplicity, let us first assume all points $x$ have exactly $d$ neighbors and that all transition
probabilities $p_{xy_i^x}$ are nonzero, and deal with the general case in Section 3.2. To clean up
the notation, denote $q_i = p_{xy_i^x}$. Working up from the last row in Figure 1, where
$q_i^{(\log d)} = q_i$, we first compute the $d-1$ numbers $q_i^{(k)}$ for $i = 0, \dots, 2^k - 1$
and $k = 0, \dots, (\log d) - 1$ from

$$q_i^{(k)} = q_{2i}^{(k+1)} + q_{2i+1}^{(k+1)}. \qquad (12)$$

The transition probabilities sum to 1, so we end with $q_0^{(0)} = 1$ at the top.
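
The bottom-up summation of (12) can be sketched classically as follows. This is only an illustration under stated assumptions: the function name `build_tree` and the sample probabilities are hypothetical, and exact fractions with denominator $2^t$ (here $t = 4$) stand in for $t$-bit numbers.

```python
from fractions import Fraction

def build_tree(q):
    """Return levels[k][i] = q_i^(k) for k = 0..log2(d), following eq. (12).

    q is the list of d leaf probabilities; d must be a power of two.
    """
    d = len(q)
    depth = d.bit_length() - 1
    assert 2 ** depth == d, "d must be a power of two"
    levels = [list(q)]                      # start from the last row, k = log d
    for _ in range(depth):
        below = levels[0]
        above = [below[2 * i] + below[2 * i + 1] for i in range(len(below) // 2)]
        levels.insert(0, above)             # row k is built from row k + 1
    return levels

# Hypothetical 4-bit probabilities 1/16, 3/16, 4/16, 8/16 for d = 4 neighbors.
q = [Fraction(k, 16) for k in (1, 3, 4, 8)]
tree = build_tree(q)
```

Here `tree[0]` is the root $q_0^{(0)}$, `tree[1]` holds the two numbers of the first row, and `tree[2]` is the list of leaves, so the rows above the leaves contain $d - 1 = 3$ numbers in total.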

Our goal is to prepare $|\psi_{\log d}\rangle = \sum_{i=0}^{d-1}\sqrt{q_i}\,|i\rangle$. We start
with $\log d$ qubits in the state

$$|\psi_0\rangle = |0\rangle_1|0\rangle_2\cdots|0\rangle_{\log d}. \qquad (13)$$

In the first round we prepare

$$|\psi_1\rangle = \left(\sqrt{q_0^{(1)}}\,|0\rangle_1 + \sqrt{q_1^{(1)}}\,|1\rangle_1\right)|0\rangle_2\cdots|0\rangle_{\log d} \qquad (14)$$

by applying a rotation to the first qubit. A rotation

$$R(\theta) = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix} \qquad (15)$$

by $\theta_0^{(1)} = \cos^{-1}\sqrt{q_0^{(1)}}$ does this job. In the second round, we apply a
rotation to the second qubit. However, the amount of rotation now has to depend on the state of the
first qubit. When the first qubit is $|0\rangle$, we apply a rotation by

$$\theta_0^{(2)} = \cos^{-1}\sqrt{\frac{q_0^{(2)}}{q_0^{(1)}}}. \qquad (16)$$

Analogously, when the first qubit is $|1\rangle$, we choose

$$\theta_1^{(2)} = \cos^{-1}\sqrt{\frac{q_2^{(2)}}{q_1^{(1)}}}. \qquad (17)$$

Observe that the second round turns (14) into

$$|\psi_2\rangle = \left(\sqrt{q_0^{(2)}}\,|00\rangle_{1,2} + \sqrt{q_1^{(2)}}\,|01\rangle_{1,2} + \sqrt{q_2^{(2)}}\,|10\rangle_{1,2} + \sqrt{q_3^{(2)}}\,|11\rangle_{1,2}\right)|0\rangle_3\cdots|0\rangle_{\log d}. \qquad (18)$$

Let us generalize this procedure. Before the $j$-th round, the qubits $j$ and higher are still in the
state $|0\rangle$, while the first $j-1$ qubits tell us where in the tree (see Figure 1) we are. In
round $j$, we thus need to rotate the $j$-th qubit by

$$\theta_i^{(j)} = \cos^{-1}\sqrt{\frac{q_{2i}^{(j)}}{q_i^{(j-1)}}} \qquad (19)$$

depending on the state $|i\rangle$ which is encoded in binary in the first $j-1$ qubits of the
'superposition' register $S$.

Applying $\log d$ rounds of this procedure results in preparing the desired superposition (11), with
the states $|i\rangle$ encoded in binary in the $\log d$ qubits.
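
The rounds of controlled rotations can be simulated classically on the amplitudes. The following is a minimal sketch rather than a circuit: it assumes all probabilities are nonzero (as in this Section), uses illustrative values for $q_i$, and the name `prepare_superposition` is introduced here. Each round multiplies the amplitude of a prefix $i$ by $\cos\theta_i^{(j)}$ and $\sin\theta_i^{(j)}$ from (19).

```python
import math

def prepare_superposition(q):
    """Simulate the log d rounds of controlled rotations on a state vector.

    q: list of d = 2^r probabilities summing to 1 (all assumed nonzero here).
    Returns the d amplitudes; index i is read in binary with qubit 1 as the
    most significant bit, matching the tree of partial sums.
    """
    d = len(q)
    r = d.bit_length() - 1
    assert 2 ** r == d, "d must be a power of two"

    # Partial sums: levels[k][i] = q_i^(k), from eq. (12).
    levels = [list(q)]
    for _ in range(r):
        below = levels[0]
        levels.insert(0, [below[2 * i] + below[2 * i + 1]
                          for i in range(len(below) // 2)])

    amps = [1.0]                            # amplitudes of the qubits touched so far
    for j in range(1, r + 1):               # round j acts on qubit j
        new_amps = []
        for i, amp in enumerate(amps):      # i = state of the first j - 1 qubits
            ratio = min(1.0, levels[j][2 * i] / levels[j - 1][i])
            theta = math.acos(math.sqrt(ratio))       # eq. (19)
            new_amps.append(amp * math.cos(theta))    # j-th qubit in |0>
            new_amps.append(amp * math.sin(theta))    # j-th qubit in |1>
        amps = new_amps
    return amps

q = [0.1, 0.2, 0.3, 0.4]                    # hypothetical transition probabilities
amps = prepare_superposition(q)
```

The product of cosines and sines along a root-to-leaf path telescopes to $\sqrt{q_i}$, which is why the final amplitudes match (11).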



3.2 A nonuniform case 

In Section 3.1, we assumed each $x$ had exactly $d$ neighbors it could transition to. To deal with
having fewer neighbors (and zero transition probabilities), we only need to add an extra 'flag'
register $F_i$ for each of the $d$ neighbors $y_i^x$ in the neighbor list. This 'flag' will be 0 if
the transition probability $p_{xy_i^x}$ is zero. Conditioning the operations in steps 2-4 of our
algorithm (see Section 3) on these 'flag' registers will deal with the nonuniform case as well.



3.3 Precision requirements 

We assumed that each of the probabilities $p_{xy_i^x}$ was given with $t$-bit precision. Our goal
was to produce a quantum sample (11) whose amplitudes would be precise to $t$ bits as well. Let us
investigate how much precision we need in our circuit to achieve this.

For any $x$, the imperfections in $q_i^{(\log d)} = p_{xy_i^x}$ (see Section 3.1) come from the
$\log d$ rotations by imperfectly calculated angles $\tilde\theta$. The argument of the inverse
cosine in (19),

$$a_i^{(j)} = \sqrt{\frac{q_{2i}^{(j)}}{q_i^{(j-1)}}}, \qquad (20)$$

obeys $0 \le a_i^{(j)} \le 1$. The errors in the rotations are the largest for $a_i^{(j)}$ close to
0 or 1 (i.e. when the $\theta$'s are close to $\frac{\pi}{2}$ or 0). To get a better handle on these
errors, we introduce extra flag qubits signaling $a_i^{(j)} = 0$ or $a_i^{(j)} = 1$ (see Appendix B
for details). In these two special cases, the rotation by $\theta$ becomes a simple bit flip or the
identity, respectively. On the other hand, because the $q$'s are given with $t$ bits, for $a$'s
bounded away from 0 and 1, we have

$$\sqrt{\frac{2^{-t}}{1}} \le a \le \sqrt{\frac{1 - 2^{-t}}{1}}. \qquad (21)$$

We choose to use an $n$-bit precision circuit for computing the $a$'s, guaranteeing that
$|\tilde a - a| \le 2^{-n}$. Using the Taylor expansion, we bound the errors on the angles $\theta$:

$$|\tilde\theta - \theta| = \left|\cos^{-1}\tilde a - \cos^{-1} a\right| = \left|\frac{d\cos^{-1}a}{da}\,(\tilde a - a) + \dots\right| \le c_1\,\frac{2^{-n}}{\sqrt{1 - a^2}} \le c_1\,2^{-n+\frac{t}{2}}, \qquad (22)$$

because $a$ is bounded away from 1 as in (21).

Each amplitude in (11) comes from multiplying out $\log d$ terms of the form $\cos\theta_i^{(j)}$ or
$\sin\theta_i^{(j)}$. For our range of $\theta$'s, the error in each sine or cosine is upper
bounded by

$$|\sin\tilde\theta - \sin\theta| \le |\tilde\theta - \theta|, \qquad |\cos\tilde\theta - \cos\theta| \le |\tilde\theta - \theta|. \qquad (23)$$

Therefore, the final error in each final amplitude is upper bounded by

$$\Delta_i \le c_1\,(\log d)\,2^{-n+\frac{t}{2}}. \qquad (24)$$

Note that the factor $\log d$ is small. Therefore, to ensure $t$-bit precision for the final
amplitudes, it is enough to work with $n = \frac{3}{2}t + O(1)$ bits of precision during the
computation of the $\theta$'s. We conclude that our circuit can be implemented efficiently and keep
the required precision.
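
The error estimate can be probed numerically. The sketch below is an illustration under explicit assumptions: the probabilities are hypothetical $t$-bit values with $t = 8$ and $d = 4$, the $n$-bit circuit is modelled by truncating each $a$ downward to $n$ bits, and the constant in (24) is taken to be $c_1 = 1$ for this example. The function name `amplitudes` is introduced here.

```python
import math

def amplitudes(q_ints, n=None):
    """Amplitudes produced by the rotation rounds for probabilities q_ints[i] / sum.

    If n is given, each argument a of the inverse cosine is truncated to n bits
    before the angle is computed, modelling an n-bit circuit for the a's.
    """
    levels = [list(q_ints)]
    while len(levels[0]) > 1:
        below = levels[0]
        levels.insert(0, [below[2 * i] + below[2 * i + 1]
                          for i in range(len(below) // 2)])
    amps = [1.0]
    for j in range(1, len(levels)):
        new_amps = []
        for i, amp in enumerate(amps):
            a = math.sqrt(levels[j][2 * i] / levels[j - 1][i])
            if n is not None:
                a = math.floor(a * 2 ** n) / 2 ** n    # |a_tilde - a| <= 2^-n
            theta = math.acos(a)
            new_amps += [amp * math.cos(theta), amp * math.sin(theta)]
        amps = new_amps
    return amps

t, d = 8, 4
q_ints = [3, 5, 100, 148]             # hypothetical t-bit probabilities, sum = 2^t
n = 3 * t // 2 + 2                    # n = 3t/2 + O(1)
exact = amplitudes(q_ints)
approx = amplitudes(q_ints, n)
worst_error = max(abs(x - y) for x, y in zip(exact, approx))
bound = math.log2(d) * 2.0 ** (-n + t / 2)    # eq. (24) with c_1 = 1 (assumption)
```

With these values the bound evaluates to $2 \cdot 2^{-10}$, which is below the target $2^{-t} = 2^{-8}$, and the observed `worst_error` can be compared against it.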



4 Conclusion 



The problem of constructing explicit efficient quantum circuits for implementing arbitrary sparse
quantum walks has not been considered in detail in the literature so far. We were interested in
an efficient implementation of a step of a quantum walk, and in finding one with a favorable scaling
of the number of required operations with $d$ (the sparsity parameter) and the accuracy parameter
$\frac{1}{\epsilon}$. Its intended use is in algorithms based on quantum walks with polynomial
speedups over their classical Markov chain counterparts.

We showed how to efficiently implement a general^2 quantum walk $W(P)$ derived from an
arbitrary sparse classical random walk $P = (p_{xy})_{x,y\in\mathcal{E}}$. We constructed a quantum
circuit $\tilde U$ that approximately implements the quantum update rule (8) with circuit complexity
scaling only linearly (with additional logarithmic factors) in $d$, the degree of sparseness of $P$,
linearly in $m = \log|\mathcal{E}|$ and polynomially in $\log\frac{1}{\epsilon}$, where $\epsilon$
denotes the desired approximation accuracy (9).

^2 Of course, much more efficient approaches exist for specific walks (e.g. those on regular, constant-degree graphs).

It has been known that quantum walks could be implemented using techniques for simulating
Hamiltonian time evolutions. However, the complexity would grow polynomially in
$\frac{1}{\epsilon}$ if we were to rely on simulating Hamiltonian dynamics (see Section 2). This
would be fatal for quantum algorithms such as the one for estimating partition functions in [9] or
future algorithms for approximating the permanent, losing the polynomial quantum speed-ups over
their classical counterparts. An alternative for implementing quantum walks whose running
complexity scales logarithmically in $\frac{1}{\epsilon}$ exists. It relies on the implementation of
combinatorially block-diagonal unitaries. However, its running time grows quadratically in $d$ (see
Section 2). When the sparsity of the walk $d$ grows with the system size $n$, this brings an extra
factor of $n$ to the complexity of the algorithms, destroying or decreasing their polynomial
speedup. This is true e.g. for the example given in Appendix A. Therefore, our approach to the
quantum update is again more suitable for this task.
A Applications

A.1 Approximating the Permanent: Where Sparsity and Accuracy Matter

In this Appendix we present a particular example of a quantum algorithm with a polynomial
speedup over its classical counterpart, requiring our efficient approach to implementing quantum
walks. The example is a rather naive quantization of the classical algorithm for approximating the
permanent of a matrix

$$\mathrm{per}(A) = \sum_{\sigma}\prod_{i=1}^{n} a_{i,\sigma(i)}, \qquad (25)$$

where $\sigma$ runs over all the permutations of $[1, \dots, n]$. For a 0/1 matrix $A$, the
permanent of $A$ is exactly the number of perfect matchings in the bipartite graph with bipartite
adjacency matrix $A$. A classical FPRAS (fully polynomial randomized approximation scheme) [7] for
this task involves taking $O^*(n^7)$ steps of a Markov chain (here $O^*$ means up to logarithmic
factors). It produces an approximation to the permanent within
$[(1-\eta)\,\mathrm{per}(A),\ (1+\eta)\,\mathrm{per}(A)]$ by using

1. $\ell = O^*(n)$ stages of simulated annealing,

2. at each stage, generating $S = O^*(n^2)$ samples from a particular Markov chain,

3. $T = O^*(n^4)$ Markov chain invocations to generate a sample from its approximate steady
state.

The failure probability of each stage is set to $\hat\eta = o(1/m^4)$ so that
$\eta = \ell\hat\eta$ is small. Hence, the total complexity (number of Markov chain steps used) is
$\ell S T = O^*(n^7)$.

The sparsity parameter $d$ of the Markov chains involved scales with the problem size $n$.
Therefore, the dependence of the implementation of the corresponding quantum walk on $d$ becomes
significant. Furthermore, because of the many stages of simulated annealing and sampling, the
error $\epsilon$ in the implementation of each quantum walk operator needs to be smaller than one
over the number of quantum walk steps involved.

The simplest quantized algorithm uses a quantum walk instead of the Markov chain, and
requires $O^*(n^5)$ steps of a quantum walk, as the mixing of the quantum walk requires only
$\sqrt{T} = O^*(n^2)$ steps. However, it is important to choose an efficient circuit to implement
each step of the quantum walk. A bad choice could destroy the speedup.

Let us compare what happens when this algorithm utilizes the different methods for quantum
walk implementation as subroutines, counting the number of required elementary gates. Note that
in this counting, all of the methods (classical and quantum) we will mention share a common factor
$m$ (the log of the state space size). However, the scaling in $d$ (the sparsity parameter) and
$\frac{1}{\epsilon}$ (precision) is what distinguishes them.

Let us look at the alternative approaches given in Section 2 and show that the small $n^2$
polynomial speedup is lost. The first two of these approaches scale with $\frac{1}{\epsilon}$. This
brings an extra $\frac{1}{\epsilon} \propto \sqrt{T} \propto n^2$ factor to the complexity of the
algorithm, destroying the speedup. The third alternative uses $O^*(d^2)$ elementary gates, adding
an extra factor of $d^2 = n^2$, again destroying the speedup. On the other hand, our method uses
only $O^*(d) = O^*(n)$ gates (the scaling coming from precision requirements only adds logarithmic
factors), and we thus retain some of the quantum advantage.
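
The bookkeeping behind this comparison is simple exponent arithmetic. The sketch below only restates the counts quoted in this Appendix as powers of $n$, ignoring logarithmic factors and the common factor $m$; the dictionary labels are descriptive names chosen here, and the per-step factors follow the text's own accounting ($d = n$, $\frac{1}{\epsilon} \propto n^2$).

```python
# Exponents of n, using the counts quoted in this Appendix.
stages, samples, mixing_classical = 1, 2, 4      # l = O*(n), S = O*(n^2), T = O*(n^4)

classical_steps = stages + samples + mixing_classical        # n^7 Markov chain steps
quantum_steps = stages + samples + mixing_classical // 2     # n^5 quantum walk steps

# Extra per-step factor (as a power of n) for each way of implementing the walk,
# with d = n and 1/eps proportional to n^2 as assumed in the text.
per_step_exponent = {
    "Hamiltonian simulation (1/eps)": 2,
    "block-diagonal unitaries (d^2)": 2,
    "this work (d)": 1,
}

total_exponent = {name: quantum_steps + extra
                  for name, extra in per_step_exponent.items()}
```

Under these assumptions the first two entries come out at $n^7$, matching the classical step count, while the linear-in-$d$ method gives $n^6$.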

This example was just an illustration of a scenario where our efficient implementation of a
quantum walk (see Section 3) is necessary. However, we see its use in a future, much better quantum
algorithm for approximating the permanent, using not only quantum walks, but also quantizing
the sampling/counting subroutine as in [9].

B Additional Details for the Efficient Quantum Update Circuit 

In this Appendix, we spell out additional details for our Quantum Update circuit and draw
the circuit out for $d = 4$.

The state space of the classical Markov chain $P$ is $\mathcal{E}$, with $|\mathcal{E}| = 2^m$. The
entries of $P$ are $p_{xy}$, the transition probabilities from state $x$ to state $y$. We assume
that $P$ is sparse, i.e. that for each $x \in \mathcal{E}$ there are at most $d$ neighbors $y_i^x$
such that $p_{xy_i^x} > 0$, and their number is small, i.e. $d \ll 2^m$. Since $d$ is a constant,
we can assume without loss of generality that $d = 2^r$. We want to implement the quantum update
(8), where $|x\rangle \in \mathbb{C}^{2^m}$.

B.1 Preparation

Classically, our knowledge of $P$ can be encoded into efficient reversible circuits outputting the
neighbors and transition probabilities for the point $x$. We will use quantum versions $N$ and $T$
of these circuits, with the following properties. The neighbor circuit $N$ acts on a register
holding a state $|x\rangle$ and $d$ copies of $(\mathbb{C}^2)^{\otimes m}$, and produces a list of
neighbors of $x$ as

$$N\,|x\rangle_L|0\rangle^{\otimes d} = |x\rangle_L \otimes |y_0^x\rangle\cdots|y_{d-1}^x\rangle. \qquad (26)$$

All the transition probabilities $p_{xy_i^x}$ are given with $t$-bit precision. The transition
probability circuit $T$ acts on a register holding a state $|x\rangle$ and $d$ copies of
$(\mathbb{C}^2)^{\otimes t}$, producing a list of transition probabilities for neighbors of
$|x\rangle$ as

$$T\,|x\rangle_L|0\rangle^{\otimes d} = |x\rangle_L \otimes |p_{xy_0^x}\rangle\cdots|p_{xy_{d-1}^x}\rangle. \qquad (27)$$

To simplify the notation, let us label $q_i = p_{xy_i^x}$. We now prepare all the terms
$q_i^{(k)}$, filling the tree in Figure 1. Starting from $q_i^{(\log d)} = q_i$, we use an adding
circuit (ADD) doing the operation $q_i^{(k-1)} = q_{2i}^{(k)} + q_{2i+1}^{(k)}$. The probability
distribution $\{q_i\}$ is efficiently integrable, so filling the tree of $q_i^{(k)}$ is easy, and we
can use Grover and Rudolph's method [2] of preparing quantum samples for such probability
distributions.

Figure 2: The Determine Angle Circuit DAC.

Figure 3: The circuit SC handling special cases.

B.2 Determining the rotation angles 

After the preparation described in the previous Section, we need to compute the appropriate
rotation angles $\tilde\theta_i^{(k)}$ for Grover and Rudolph's method. For this, we use the
Determine Angle Circuit (DAC). This circuit produces

$$\tilde\theta_i^{(k)} = \cos^{-1}\sqrt{\frac{q_{2i}^{(k)}}{q_i^{(k-1)}}}, \qquad (28)$$

while also handling the special cases $q_{2i}^{(k)} = q_i^{(k-1)}$ and $q_{2i}^{(k)} = 0$. For
simplicity, let us label $b = q_i^{(k-1)}$, $c = q_{2i}^{(k)}$. The DAC circuit first checks the
special cases, and then, conditioned on the state of the two flag qubits, computes (28). We draw it
in Figure 2, with the special case-analysing circuit SC given in Figure 3. Here EQ is a subroutine
testing whether two registers (in computational basis states) are the same. The first EQ tests the
states $|0\rangle$ and $|c\rangle$, while the second EQ runs the test on $|c\rangle$ and
$|b\rangle$. We have the following four scenarios depending on the flag qubits:

    00        the circuit $\theta$ computes normally,
    01, 11    the circuit $\theta$ does nothing (keeps the angle $\theta = 0$, as $b = c$),      (29)
    10        the circuit $\theta$ outputs $\theta = \pi/2$, as $c = 0$.

The third option corresponds to $c = 0$, when all the probability in the next layer of the tree is
concentrated in the right branch. We then simply flip the superposition qubit, using
$\theta = \frac{\pi}{2}$.
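
The case analysis of (29) can be modelled classically as follows. This is a sketch under stated assumptions: the function name `determine_angle` is introduced here, and the ordering of the two flag bits (first flag from the test of $|0\rangle$ against $|c\rangle$, second from $|c\rangle$ against $|b\rangle$) is read off from the description above rather than from the circuit figures.

```python
import math

def determine_angle(b, c):
    """Classical model of the DAC: returns (flags, theta) for parent b and left child c.

    flags[0] is the outcome of the first EQ test (c == 0),
    flags[1] is the outcome of the second EQ test (c == b).
    """
    flags = (int(c == 0), int(c == b))
    if flags == (0, 0):
        theta = math.acos(math.sqrt(c / b))   # generic case, eq. (28)
    elif flags == (1, 0):
        theta = math.pi / 2                   # c = 0: flip the superposition qubit
    else:
        theta = 0.0                           # b = c (flags 01 or 11): do nothing
    return flags, theta
```

For example, `determine_angle(8, 2)` falls into the generic case, while `determine_angle(8, 0)` and `determine_angle(8, 8)` exercise the two special cases; the inputs here are integer numerators of $t$-bit probabilities chosen for illustration.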

B.3 Creating superpositions and mapping 

After the angle is determined, we apply the corresponding rotation to the appropriate qubit in the
superposition register $S$, as described in Section 3.1. We then uncompute the rotation angle.

Once the final superposition is created in $S$, we invoke a mapping circuit $M$. This $M$ acts on
the register holding the names of the $d$ neighbors of $x$, the superposition register, and the
output register $R$. It takes $y_j^x$, the name of the $j$-th neighbor of $x$, and puts it into the
output register as

$$M\,|0\rangle_R \otimes |y_0^x\rangle \otimes \cdots \otimes |y_{d-1}^x\rangle \otimes |j\rangle_S = |y_j^x\rangle_R \otimes |y_0^x\rangle \otimes \cdots \otimes |y_{d-1}^x\rangle \otimes |j\rangle_S. \qquad (30)$$

We can do this, because the names of the states in $\mathcal{E}$ are given as computational basis
states. The next step is to uncompute the label $j$ in the last register with a cleaning circuit
$C$ as

$$C\,|y_j^x\rangle_R \otimes |y_0^x\rangle \otimes \cdots \otimes |y_{d-1}^x\rangle \otimes |j\rangle_S = |y_j^x\rangle_R \otimes |y_0^x\rangle \otimes \cdots \otimes |y_{d-1}^x\rangle \otimes |0\rangle_S. \qquad (31)$$

These two steps transfer the superposition from the register $S$ (with $r = \log d$ qubits) into the
output register $R$ (which has $m$ qubits). The final step of our procedure is to uncompute (clean
up) the lists of neighbors and transition probabilities.
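
On computational basis states, the mapping and cleaning steps act like reversible classical lookups. The sketch below illustrates this under explicit assumptions: neighbor names are modelled as distinct integers, the functions `mapping` and `cleaning` are hypothetical stand-ins for $M$ and $C$, and XOR is used so that each map is its own inverse on the register it modifies.

```python
def mapping(R, neighbors, j):
    """M: write the name of the j-th neighbor into the (initially zero) output register.

    Names are integers (computational basis labels); XOR keeps the map reversible.
    """
    return R ^ neighbors[j], neighbors, j

def cleaning(R, neighbors, j):
    """C: reset the label j by XOR-ing in the position of R in the neighbor list."""
    position = neighbors.index(R)      # assumes the neighbor names are distinct
    return R, neighbors, j ^ position

neighbors = [5, 9, 12, 14]             # hypothetical names of the d = 4 neighbors of x
state = mapping(0, neighbors, 2)       # output register now holds neighbors[2]
state = cleaning(*state)               # label register is reset to 0
```

Applied to a superposition over $j$, these two maps move the amplitudes from the register $S$ to the register $R$ while leaving $S$ in $|0\rangle$, which is the transfer described above.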

B.4 The required resources 

Let us count the number of qubits and operations required for our quantum update rule $U$ based on
a $d$-sparse stochastic transition matrix $P$. The number of ancillae required is
$\Omega(dm + dt)$, where $2^m$ is the size of the state space and $t$ is the required precision of
the transition probabilities. Moreover, the required number of operations scales like
$\Omega(d\,r\,m\,a_\theta)$, where $r = \log d$ and $a_\theta$ is the number of operations required
to compute the angle $\theta$ with $n = \Omega(t)$-bit precision. Finally, when we have $t$-bit
precision of the final amplitudes, the precision of the unitary we applied is

$$\left\|\left(U - \tilde U\right)|x\rangle \otimes |0\rangle\right\| \le \epsilon, \qquad (32)$$

for any $x \in \mathcal{E}$ when $t = \Omega\left(\log d + \log\frac{1}{\epsilon}\right)$. The total
number of operations in our circuit thus scales like

$$\Omega\left(m\,d\,\mathrm{poly}(\log d) + m\,d\,(\log d)\,\mathrm{poly}\left(\log\frac{1}{\epsilon}\right)\right). \qquad (33)$$
Table 1: Required numbers of qubits

    Register type                       Required number of qubits
    $x$ (register $L$)                  $m$
    $y$ (register $R$)                  $m$
    $y_i^x$ (neighbor list)             $d \times m$
    $q_i$'s (probabilities)             $(2d - 2) \times t$
    flag qubits                         2
    $\theta$ (rotation angle)           $n = \frac{3}{2}t + O(1)$
    ancillae for computing $\theta$     $a_\theta = \mathrm{poly}(n) = \mathrm{poly}(t)$
    superposition register $S$          $r = \log d$



Figure 4: The efficient Quantum Update, creating the superposition (11), for $d = 4$. The bottom
two lines represent the 'superposition' register $S$.

Besides the registers for the input $|x\rangle_L$ and output $|0\rangle_R$, we need $d$ registers
(with $m$ qubits) to hold the names of the neighbors of $x$, and $2d - 2$ registers (with $t$
qubits) to store the transition probabilities $q_i$. The DAC circuit requires two extra flag qubits
and a register with $n = \frac{3}{2}t + O(1)$ qubits to store the angle $\theta$. Computing the
angle $\theta$ requires a circuit with poly($n$) qubits. Finally, the superposition register $S$
holds $r$ qubits. These requirements are summarized in Table 1.
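
The register sizes listed above can be added up for concrete parameters. The following sketch is illustrative only: the function name `qubit_count` and the sample values of $m$, $d$ and $t$ are hypothetical, the angle register is taken as $n = \lceil 3t/2 \rceil$ with the $O(1)$ term dropped, and the number of ancillae for computing the angle is left as an input because it depends on the arithmetic circuit chosen for the inverse cosine.

```python
import math

def qubit_count(m, d, t, angle_ancillae=0):
    """Add up the register sizes of the quantum update circuit.

    angle_ancillae stands for a_theta = poly(n); its exact value is not fixed
    here, so it is an input with default 0.
    """
    r = math.ceil(math.log2(d))        # superposition register S
    n = math.ceil(3 * t / 2)           # angle register, n = 3t/2 + O(1) (O(1) dropped)
    return {
        "x (register L)": m,
        "y (register R)": m,
        "neighbor list": d * m,
        "probabilities": (2 * d - 2) * t,
        "flag qubits": 2,
        "rotation angle": n,
        "ancillae for the angle": angle_ancillae,
        "superposition register S": r,
    }

counts = qubit_count(m=20, d=8, t=16)
total = sum(counts.values())
```

The neighbor list and the tree of probabilities dominate the total, in line with the $\Omega(dm + dt)$ ancilla count.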

To conclude, we draw out the superposition-creating part of the quantum update for $d = 4$ in
Figure 4. The first two lines represent the superposition register $S$, in which we prepare

$$|\varphi\rangle = \sqrt{q_0}\,|00\rangle + \sqrt{q_1}\,|01\rangle + \sqrt{q_2}\,|10\rangle + \sqrt{q_3}\,|11\rangle = \sum_{i=0}^{3}\sqrt{q_i}\,|i\rangle. \qquad (34)$$